Uniform Resource Locator (World-Wide Web; URL, previously "Universal Resource Locator") A standard way of specifying the location of an object, typically a web page, on the Internet. Other types of object are described below. URLs are the form of address used on the World-Wide Web. They are used in HTML documents to specify the target of a hypertext link, which is often another HTML document (possibly stored on another computer).
Here are some example URLs:
https://w3.org/default.html
https://acme.co.uk:8080/images/map.gif
https://foldoc.org/?Uniform+Resource+Locator
https://w3.org/default.html#Introduction
ftp://wuarchive.wustl.edu/mirrors/msdos/graphics/gifkit.zip
ftp://spy:secret@ftp.acme.com/pub/topsecret/weapon.tgz
mailto:fred@doc.ic.ac.uk
news:alt.hypertext
telnet://dra.com
The part before the first colon specifies the access scheme or protocol. Commonly implemented schemes include: ftp, http (World-Wide Web), gopher or WAIS. The "file" scheme should only be used to refer to a file on the same host. Other less commonly used schemes include news, telnet or mailto (e-mail).
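As a minimal sketch of the rule that the scheme is whatever precedes the first colon, the following Python splits a URL at that point. The function name split_scheme is hypothetical and chosen only for this illustration; it does not validate the scheme against any list.

```python
from __future__ import annotations


def split_scheme(url: str) -> tuple[str, str]:
    """Split a URL at its first colon into (scheme, rest).

    Hypothetical helper for illustration; it performs no validation
    beyond requiring that a colon is present.
    """
    scheme, colon, rest = url.partition(":")
    if not colon:
        raise ValueError(f"no scheme found in {url!r}")
    # Scheme names are conventionally compared case-insensitively.
    return scheme.lower(), rest


if __name__ == "__main__":
    for example in (
        "https://w3.org/default.html",
        "mailto:fred@doc.ic.ac.uk",
        "news:alt.hypertext",
    ):
        print(split_scheme(example))
```

Everything after the colon is returned untouched, since its interpretation depends on the scheme.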
The part after the colon is interpreted according to the access scheme. In general, two slashes after the colon introduce a hostname (host:port is also valid, or for FTP user:passwd@host or user@host). The port number is usually omitted and defaults to the standard port for the scheme, e.g. port 80 for HTTP.
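The host, port and user parts described above can be picked apart with Python's standard urllib.parse module. This is only a sketch: the DEFAULT_PORTS table and the function name describe_authority are assumptions made for this example, and of the listed defaults only port 80 for HTTP comes from the text.

```python
from urllib.parse import urlsplit

# Assumed defaults for illustration; only port 80 for HTTP is stated in the text.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "telnet": 23}


def describe_authority(url):
    """Return the scheme, user, password, host and effective port of a URL."""
    parts = urlsplit(url)
    # An omitted port falls back to the scheme's assumed default (or None).
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": port,
    }


if __name__ == "__main__":
    print(describe_authority("https://acme.co.uk:8080/images/map.gif"))
    print(describe_authority("ftp://spy:secret@ftp.acme.com/pub/topsecret/weapon.tgz"))
```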
For an HTTP or FTP URL the next part is a pathname which is usually related to the pathname of a file on the server. The file can contain any type of data but only certain types are interpreted directly by most browsers. These include HTML and images in gif or jpeg format. The file's type is given by a MIME type in the HTTP headers returned by the server, e.g. "text/html", "image/gif", and is usually also indicated by its filename extension. A file whose type is not recognised directly by the browser may be passed to an external "viewer" application, e.g. a sound player.
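To illustrate the link between filename extension and MIME type, the sketch below guesses a type from the pathname alone using Python's standard mimetypes module. The function name guess_type_from_url is hypothetical. The guess is only a hint: as noted above, the type actually in force is the one the server sends in its HTTP headers.

```python
import mimetypes
from urllib.parse import urlsplit


def guess_type_from_url(url):
    """Guess a MIME type from the extension of the URL's pathname.

    Returns None when the extension is missing or unknown.
    """
    # Use only the path so that query strings and fragments are ignored.
    path = urlsplit(url).path
    mime_type, _encoding = mimetypes.guess_type(path)
    return mime_type


if __name__ == "__main__":
    print(guess_type_from_url("https://acme.co.uk:8080/images/map.gif"))
    print(guess_type_from_url("https://w3.org/default.html#Introduction"))
```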
The last (optional) part of the URL may be a query string preceded by "?" or a "fragment identifier" preceded by "#". The latter indicates a particular position within the specified document.
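The optional trailing parts can be separated from the pathname with urllib.parse.urlsplit, as in this small sketch applied to two of the example URLs. The function name tail_parts is an assumption made for illustration.

```python
from urllib.parse import urlsplit


def tail_parts(url):
    """Return (path, query, fragment); absent parts come back as ''."""
    parts = urlsplit(url)
    return parts.path, parts.query, parts.fragment


if __name__ == "__main__":
    print(tail_parts("https://foldoc.org/?Uniform+Resource+Locator"))
    print(tail_parts("https://w3.org/default.html#Introduction"))
```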
Only alphanumerics, reserved characters (:/?#"%+) used for their reserved purposes and "$", "-", "_", ".", "&", "+" are safe and may be transmitted unencoded. Other characters are encoded as a "%" followed by two hexadecimal digits. Space may also be encoded as "+". Standard SGML "&name;" character entity encodings (e.g. "&eacute;") are also accepted when URLs are embedded in HTML. The terminating semicolon may be omitted if "&name" is followed by a non-letter character.
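The encoding rules above can be tried out with Python's standard library, as sketched below. The helper names are hypothetical, and the set of characters Python leaves unencoded by default is not identical to the list of safe characters given in this entry, so the output shows the general mechanism rather than this entry's exact rules.

```python
import html
from urllib.parse import quote, quote_plus, unquote_plus


def encode_path_segment(text):
    """Percent-encode every character that is not unreserved."""
    return quote(text, safe="")


def encode_query_value(text):
    """Like percent-encoding, but a space becomes '+'."""
    return quote_plus(text)


def decode_query_value(text):
    """Reverse encode_query_value: '+' becomes a space, %XX is decoded."""
    return unquote_plus(text)


def decode_html_entities(text):
    """Resolve character entities in a URL taken from HTML source."""
    return html.unescape(text)


if __name__ == "__main__":
    print(encode_path_segment("map of Europe.gif"))
    print(encode_query_value("Uniform Resource Locator"))
    print(decode_query_value("Uniform+Resource+Locator"))
    print(decode_html_entities("caf&eacute;"))
```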
See the authoritative W3C URL specification (https://w3.org/hypertext/WWW/Addressing/Addressing.html).
(2000-02-17)